Efficient sampling of non-strict turnstile data streams
Similar resources
Feasible Sampling of Non-strict Turnstile Data Streams
We present the first feasible method for sampling a dynamic data stream with deletions, where the sample consists of pairs (k,Ck) of a value k and its exact total count Ck. Our algorithms are for both Strict Turnstile data streams and the most general Non-strict Turnstile data streams, where each element may have a negative total count. Our method improves by an order of magnitude the known pro...
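To make the data model in this abstract concrete, here is a minimal sketch of a turnstile stream as a sequence of (key, delta) updates. It is not the paper's sampling algorithm: it stores every count exactly, which is the linear-memory baseline that a sampling method is meant to avoid. The function name and the example stream are hypothetical.

```python
from collections import defaultdict


def apply_turnstile_updates(updates):
    """Apply (key, delta) updates and return the exact nonzero totals.

    Illustrative baseline only: it keeps one counter per distinct key.
    """
    counts = defaultdict(int)
    for key, delta in updates:
        # delta > 0 is an insertion, delta < 0 is a deletion.
        counts[key] += delta
    # In the non-strict turnstile model a total may be negative,
    # so every nonzero total is kept, not only the positive ones.
    return {key: total for key, total in counts.items() if total != 0}


# Hypothetical stream: "b" cancels out, "c" ends with a negative total.
stream = [("a", 3), ("b", 1), ("a", -1), ("b", -1), ("c", -2)]
totals = apply_turnstile_updates(stream)
print(totals)  # {'a': 2, 'c': -2}
```

Each entry of `totals` is a pair (k, Ck) in the sense of the abstract. In a strict turnstile stream the negative total for "c" could not occur.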
Maximum Matching in Turnstile Streams
We consider the unweighted bipartite maximum matching problem in the one-pass turnstile streaming model where the input stream consists of edge insertions and deletions. In the insertion-only model, a one-pass 2-approximation streaming algorithm can be easily obtained with space O(n log n), where n denotes the number of vertices of the input graph. We show that no such result is possible if edge...
New Characterizations in Turnstile Streams with Applications
Recently, [Li, Nguyen, Woodruff, STOC’2014] showed any 1-pass constant probability streaming algorithm for computing a relation f on a vector x ∈ {−m, −(m−1), …, m}^n presented in the turnstile data stream model can be implemented by maintaining a linear sketch A · x mod q, where A is an r×n integer matrix and q = (q1, …, qr) is a vector of positive integers. The space complexity of main...
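The linear sketch A · x mod q mentioned in this abstract can be maintained one update at a time, because an update x[i] += delta changes row j of A · x by A[j][i] · delta. The sketch below illustrates only that bookkeeping. The matrix entries, the moduli, and all names are arbitrary assumptions, not the construction from the cited result.

```python
import random


def make_sketch_matrix(r, n, seed=0):
    """Return an r x n integer matrix with small random entries (illustrative)."""
    rng = random.Random(seed)
    return [[rng.randint(-3, 3) for _ in range(n)] for _ in range(r)]


class LinearSketch:
    """Maintains A . x mod q under turnstile updates without storing x."""

    def __init__(self, A, q):
        self.A = A                 # r x n integer matrix
        self.q = q                 # list of r positive moduli
        self.state = [0] * len(A)  # current value of A . x mod q

    def update(self, i, delta):
        # x[i] += delta shifts coordinate j of A . x by A[j][i] * delta.
        for j, row in enumerate(self.A):
            self.state[j] = (self.state[j] + row[i] * delta) % self.q[j]


A = make_sketch_matrix(r=3, n=5, seed=1)
sketch = LinearSketch(A, q=[7, 11, 13])
for index, delta in [(0, 2), (3, -5), (0, -1), (4, 4)]:
    sketch.update(index, delta)
print(sketch.state)
```

Because the state depends only on the final vector x, the order of the updates does not matter, and deletions are handled the same way as insertions.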
KSample: Dynamic Sampling Over Unbounded Data Streams
Data sampling over data streams is common practice to allow the analysis of data in real-time. However, sampling over data streams becomes complex when the stream does not fit in memory, and worse yet, when the length of the stream is unknown. A well-known technique for sampling data streams is reservoir sampling. It requires a fixed-size reservoir that corresponds to the resulting sample s...
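For reference, classic reservoir sampling with a fixed-size reservoir can be sketched as follows. This is the textbook insertion-only procedure, not the KSample method, and the function and parameter names are hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable.

    The stream is read once and its length need not be known in advance.
    """
    rng = rng or random.Random()
    reservoir = []
    for t, item in enumerate(stream):
        if t < k:
            # The first k items fill the reservoir.
            reservoir.append(item)
        else:
            # Item number t + 1 is kept with probability k / (t + 1).
            j = rng.randint(0, t)  # both endpoints inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(1000), k=10, rng=random.Random(42))
print(sample)
```

This procedure assumes items are only ever inserted. It has no way to undo the effect of a deleted item, which is why turnstile streams need different techniques.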
Weighted Random Sampling over Data Streams
In this work, we present a comprehensive treatment of weighted random sampling (WRS) over data streams. More precisely, we examine two natural interpretations of the item weights, describe an existing algorithm for each case ([2,4]), discuss sampling with and without replacement and show adaptations of the algorithms for several WRS problems and evolving data streams.
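As an illustration of weighted sampling without replacement over a stream, the sketch below uses the well-known key-based scheme in which each item draws a key u^(1/w), with u uniform in [0, 1) and w the item weight, and the k items with the largest keys are kept. Whether this matches the algorithms cited as [2,4] in the abstract is an assumption. The names are hypothetical.

```python
import heapq
import random


def weighted_reservoir_sample(stream, k, rng=None):
    """Sample up to k items without replacement from (item, weight) pairs.

    Items with larger weights are more likely to be kept.
    Items with non-positive weight are skipped.
    """
    rng = rng or random.Random()
    heap = []  # min-heap of (key, arrival_index, item), size at most k
    for idx, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue
        key = rng.random() ** (1.0 / weight)
        entry = (key, idx, item)  # idx breaks ties without comparing items
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            # The new key beats the smallest kept key, so replace it.
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]


items = [("a", 1.0), ("b", 2.0), ("c", 0.0), ("d", 5.0)]
picked = weighted_reservoir_sample(items, k=2, rng=random.Random(0))
print(picked)
```

Keeping only the k largest keys in a min-heap means each arriving item costs at most O(log k) time and the memory stays proportional to k.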
Journal
Journal title: Theoretical Computer Science
Year: 2015
ISSN: 0304-3975
DOI: 10.1016/j.tcs.2015.01.026